Global disease monitoring and forecasting with Wikipedia
Infectious disease is a leading threat to public health, economic stability,
and other key social structures. Efforts to mitigate these impacts depend on
accurate and timely monitoring to measure the risk and progress of disease.
Traditional, biologically-focused monitoring techniques are accurate but costly
and slow; in response, new techniques based on social internet data such as
social media and search queries are emerging. These efforts are promising, but
important challenges in the areas of scientific peer review, breadth of
diseases and countries, and forecasting hamper their operational usefulness.
We examine a freely available, open data source for this use: access logs
from the online encyclopedia Wikipedia. Using linear models, language as a
proxy for location, and a systematic yet simple article selection procedure, we
tested 14 location-disease combinations and demonstrate that these data
feasibly support an approach that overcomes these challenges. Specifically, our
proof-of-concept yields models with r^2 up to 0.92, forecasting value up to
the 28 days tested, and several pairs of models similar enough to suggest that
transferring models from one location to another without re-training is
feasible.
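The approach described above (a linear model relating article traffic to later disease incidence) can be illustrated with a minimal sketch. It assumes a single predictor, ordinary least squares, and the coefficient of determination as the goodness-of-fit measure; the function names and the toy numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean


def lagged_pairs(views, cases, lag):
    """Pair article views at time t with case counts at time t + lag (lag >= 0)."""
    if lag == 0:
        return list(views), list(cases)
    return list(views[:-lag]), list(cases[lag:])


def fit_line(x, y):
    """Ordinary least squares fit of y ~ slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(x, y, slope, intercept):
    """Coefficient of determination of the fitted line on (x, y)."""
    my = mean(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy series: cases follow views two time steps later.
views = [10, 20, 30, 40, 50, 60]
cases = [0, 0, 25, 45, 65, 85]

x, y = lagged_pairs(views, cases, lag=2)
slope, intercept = fit_line(x, y)
fit_quality = r_squared(x, y, slope, intercept)
print(slope, intercept, fit_quality)
```

Scanning several values of `lag` and comparing the resulting fit quality is one simple way to probe how far ahead the traffic signal carries information, under the assumptions stated above.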
Based on these preliminary results, we close with a research agenda designed
to overcome these challenges and produce a disease monitoring and forecasting
system that is significantly more effective, robust, and globally comprehensive
than the current state of the art.
Comment: 27 pages; 4 figures; 4 tables. Version 2: Cite McIver & Brownstein
and adjust novelty claims accordingly; revise title; various revisions for
clarit
Epidemiological data challenges: planning for a more robust future through data standards
Accessible epidemiological data are of great value for emergency preparedness
and response, understanding disease progression through a population, and
building statistical and mechanistic disease models that enable forecasting.
The status quo, however, renders acquiring and using such data difficult in
practice. In many cases, a primary way of obtaining epidemiological data is
through the internet, but the methods by which the data are presented to the
public often differ drastically among institutions. As a result, there is a
strong need for better data sharing practices. This paper identifies, in detail
and with examples, the three key challenges one encounters when attempting to
acquire and use epidemiological data: 1) interfaces, 2) data formatting, and 3)
reporting. These challenges are used to provide suggestions and guidance for
improvement as these systems evolve in the future. If these suggested data and
interface recommendations were adhered to, epidemiological and public health
analysis, modeling, and informatics work would be significantly streamlined,
which can in turn yield better public health decision-making capabilities.
Comment: v2 includes several typo fixes; v3 adds a paragraph on backfill; v4
adds 2 new paragraphs to the conclusion that address Frontiers reviewer
comments; v5 adds some minor modifications that address additional reviewer
comment
Forecasting the 2013--2014 Influenza Season using Wikipedia
Infectious diseases are one of the leading causes of morbidity and mortality
around the world; thus, forecasting their impact is crucial for planning an
effective response strategy. According to the Centers for Disease Control and
Prevention (CDC), seasonal influenza affects between 5% and 20% of the U.S.
population and causes major economic impacts resulting from hospitalization and
absenteeism. Understanding influenza dynamics and forecasting its impact is
fundamental for developing prevention and mitigation strategies.
We combine modern data assimilation methods with Wikipedia access logs and
CDC influenza-like illness (ILI) reports to create a weekly forecast for
seasonal influenza. The methods are applied to the 2013--2014 influenza season
but are sufficiently general to forecast any disease outbreak, given incidence
or case count data. We adjust the initialization and parametrization of a
disease model and show that this allows us to determine systematic model bias.
In addition, we provide a way to determine where the model diverges from
observation and evaluate forecast accuracy.
Wikipedia article access logs are shown to be highly correlated with
historical ILI records and allow for accurate prediction of ILI data several
weeks before it becomes available. The results show that prior to the peak of
the flu season, our forecasting method projected the actual outcome with a high
probability. However, since our model does not account for re-infection or
multiple strains of influenza, the tail of the epidemic is not predicted well
after the peak of flu season has passed.
Comment: Second version. In previous version 2 figure references were
compiling wrong due to error in latex sourc
The Biosurveillance Analytics Resource Directory (BARD): Facilitating the Use of Epidemiological Models for Infectious Disease Surveillance
Epidemiological modeling for infectious disease is important for disease management, and its routine implementation needs to be facilitated through better description of models in an operational context. A standardized model characterization process that allows selection or manual comparison of available models and their results is currently lacking. A key need is a universal framework to facilitate model description and understanding of its features. Los Alamos National Laboratory (LANL) has developed a comprehensive framework that can be used to characterize an infectious disease model in an operational context. The framework was developed through a consensus among a panel of subject matter experts. In this paper, we describe the framework, its application to model characterization, and the development of the Biosurveillance Analytics Resource Directory (BARD; http://brd.bsvgateway.org/brd/), to facilitate the rapid selection of operational models for specific infectious/communicable diseases. We offer this framework and associated database to stakeholders of the infectious disease modeling field as a tool for standardizing model description and facilitating the use of epidemiological models.
Results from the Centers for Disease Control and Prevention's Predict the 2013-2014 Influenza Season Challenge
Background: Early insights into the timing of the start, peak, and intensity of the influenza season could be useful in planning influenza prevention and control activities. To encourage development and innovation in influenza forecasting, the Centers for Disease Control and Prevention (CDC) organized a challenge to predict the 2013-14 United States influenza season. Methods: Challenge contestants were asked to forecast the start, peak, and intensity of the 2013-2014 influenza season at the national level and at any or all Health and Human Services (HHS) region level(s). The challenge ran from December 1, 2013 to March 27, 2014; contestants were required to submit 9 biweekly forecasts at the national level to be eligible. The selection of the winner was based on expert evaluation of the methodology used to make the prediction and the accuracy of the prediction as judged against the U.S. Outpatient Influenza-like Illness Surveillance Network (ILINet). Results: Nine teams submitted 13 forecasts for all required milestones. The first forecast was due on December 2, 2013; 3/13 forecasts received correctly predicted the start of the influenza season within 1 week, 1/13 predicted the peak within 1 week, 3/13 predicted the peak ILINet percentage within 1%, and 4/13 predicted the season duration within 1 week. For the prediction due on December 19, 2013, the number of forecasts that correctly predicted the peak week increased to 2/13, the peak percentage to 6/13, and the duration of the season to 6/13. As the season progressed, the forecasts became more stable and were closer to the season milestones. Conclusion: Forecasting has become technically feasible, but further efforts are needed to improve forecast accuracy so that policy makers can reliably use these predictions. CDC and challenge contestants plan to build upon the methods developed during this contest to improve the accuracy of influenza forecasts. © 2016 The Author(s)
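The accuracy tallies reported in the Results (for example, the number of forecasts that fell within 1 week of an observed milestone) amount to counting predictions inside a tolerance band. A minimal sketch of that counting step follows; the function name and the example values are hypothetical and do not reproduce the challenge data or its scoring procedure.

```python
def count_within(predictions, observed, tolerance):
    """Count predictions whose absolute error is at most `tolerance`."""
    return sum(1 for p in predictions if abs(p - observed) <= tolerance)


# Hypothetical example: predicted peak weeks from four forecasts,
# compared with an observed peak in week 5, allowing an error of 1 week.
predicted_peak_weeks = [2, 4, 5, 7]
hits = count_within(predicted_peak_weeks, observed=5, tolerance=1)
print(f"{hits}/{len(predicted_peak_weeks)} forecasts within 1 week")
```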
National and subnational short-term forecasting of COVID-19 in Germany and Poland during early 2021
We compare forecasts of weekly case and death numbers for COVID-19 in Germany and Poland based on 15 different modelling approaches. These cover the period from January to April 2021 and address numbers of cases and deaths one and two weeks into the future, along with the respective uncertainties. We find that combining different forecasts into one forecast can enable better predictions. However, case numbers over longer periods were challenging to predict. Additional data sources, such as information about different versions of the SARS-CoV-2 virus present in the population, might improve forecasts in the future.
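The abstract does not say how the individual forecasts were combined. As one illustration only, the sketch below takes the median across models at each forecast horizon; the choice of the median, the function name, and the numbers are all assumptions and should not be read as the study's method.

```python
from statistics import median


def median_ensemble(forecasts):
    """Combine point forecasts by taking the per-horizon median across models.

    `forecasts` maps a model name to a list of point forecasts, one per horizon.
    """
    lengths = {len(values) for values in forecasts.values()}
    if len(lengths) != 1:
        raise ValueError("all models must forecast the same horizons")
    return [median(step) for step in zip(*forecasts.values())]


# Hypothetical weekly case forecasts for horizons of one and two weeks.
forecasts = {
    "model_a": [100, 120],
    "model_b": [110, 150],
    "model_c": [90, 130],
}
combined = median_ensemble(forecasts)
print(combined)
```

A median is less sensitive to a single outlying model than a mean, which is one reason it is a common default for this kind of combination.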